Abstract

We present an approach to writing and formally verify- 
ing high-assurance file-system code in a restricted lan- 
guage called Cogent, supported by a certifying compiler 
that produces C code, a high-level specification of COGENT,
and translation correctness proofs. The language is strongly 
typed and guarantees absence of a number of common file 
system implementation errors. We show how verification ef- 
fort is drastically reduced for proving higher-level proper- 
ties of the file system implementation by reasoning about 
the generated formal specification rather than its low-level C 
code. We use the framework to write two Linux file systems, 
and compare their performance with their native C imple- 
mentations. 

Categories and Subject Descriptors D.4.5 [Operating
Systems]: Reliability – Verification; D.2.4 [Software
Engineering]: Software / Program Verification – Formal
methods; D.3.2 [Programming Languages]: Language
Classification – Applicative (functional) languages

Keywords file systems; verification; domain-specific lan- 
guages; co-generation; Isabelle/HOL 

1. Introduction 

Operating systems (OS) code is critical for the dependability 
and security of computer systems. In monolithic systems, 
most OS services are part of the kernel and are included in 
the trusted computing base (TCB) of any application. While 
microkernels can help reduce the TCB [10, 17, 29, 47], in 
practice we need more, because most applications depend on 
the correct operation of a significant number of OS services. 
File systems, which constitute the largest fraction of code
in Linux after device drivers, have among the highest defect 
density of Linux kernel code [38]. Furthermore, new file sys- 
tems are frequently added for new media types or for new 
performance and reliability trade-offs. The lack of support 
for correctly writing new file systems is a standard bottle- 
neck. 

Formal verification in the style of seL4 [26] or Ironclad 
[15] is the only approach to fully solve this problem. While 
formal verification for file systems has seen a surge in ac- 
tivity in recent years [9, 11, 18], it has however not yet 
reached that same level of detail and assurance. In the most 
advanced and deepest of these verifications, Chen et al. [8] 
formally show crash-resilience of the FSCQ file system implemented
in the Coq prover [6]. Even this verification still
relies on generating Haskell code from the Coq implemen- 
tation, and executing that generated code with a full Haskell 
run-time at user level. This proof crosses an important bar- 
rier for formal verification: it removes human intervention 
from the link between the code that runs and the model that 
is verified. However, the trusted code base is still huge: the 
run-time is larger than the file system, the code generation 
step from Coq is unverified, and the semantics of the tar- 
get language (Haskell) is complex and informal. While cer- 
tainly an impressive result, for high-assurance systems on 
potentially resource-constrained devices, this is not yet good 
enough. 

As motivated in [23], we design a new language called 
Cogent to bridge the gap between verifiable formal model 
and low-level code. In separate work [36], we define the
formal semantics of COGENT, describe in detail the COGENT
compiler, which generates C code and an Isabelle/HOL
specification, and show how the seven main stages of the
automatic compiler-produced proof connect the C code
and the Isabelle/HOL specification. In this paper, we show
how to reason about COGENT programs, how COGENT is 
used to implement two Linux file systems, and we analyse 
how COGENT reduces the effort of reasoning about a low- 
level C implementation in order to more productively rea- 
son about high-level functional specifications — similar in 
style to Chen et al.’s verification. To this end, we verified two


high-level functional correctness properties of one of the two 
Linux file systems. These are properties that would form key 
building blocks in, for instance, a proof of crash resilience. 

COGENT goes further than just bridging the gap from 
model to code. It also provides tools for file system program- 
mers without any formal verification expertise to provably 
avoid common errors: The language is type safe, i.e. the type 
system is strictly enforced. The compiler and type system in
turn enforce basics like memory safety, but also the absence
of any undefined behaviour on the C level, null pointer deref- 
erences, buffer overflows, memory leaks, and pointer mis- 
management in error handling. Mismanagement of pointers 
in error-handling code is a wide-spread problem in Linux file 
systems specifically [41], and Saha et al. [42] show that file
systems have among the highest density of error-handling 
code in Linux. In the verification of seL4, a significant frac- 
tion of the overall effort went into proving absence of such 
errors [25]; COGENT’s linear type system [49] provides this
for free and assists the programmer with memory allocation
handling. Memory safety and memory leaks are problems
with file systems, but also more generally in systems
code [43]. The infamous Heartbleed bug, for instance, was
a buffer overflow [16], the recent “goto-fail” defect in Ap- 
ple’s SSL/TLS implementation was an error-handling prob- 
lem obscured by gotos in an if-cascade [37], and the recent 
memory leak in Android Lollipop also was part of error- 
handling code [30]. All of these problems are prevented on 
the language level by COGENT, and are enforced by its certi- 
fying compiler. Few other languages come with type-system 
guarantees that address problems like memory leaks without 
significant overhead or large language run-times (e.g. Rust), 
and none come with a formal semantics and a machine- 
checked proof of their guarantees. 

Cogent compiles to straight C code that can be com- 
piled by gcc or CompCert [27, 28]. The generated C code 
also falls into the fragment understood by Sewell et al.’s gcc 
translation validation tool [46], providing a pathway to ex- 
tend the verification all the way to binary level if needed. 
We designed COGENT with the code patterns and common 
errors in Linux file systems in mind. As detailed in sepa- 
rate work [36], its semantics is sequential (allowing asyn- 
chronous I/O, but not full concurrency), restricted to total 
functions, and contains no built-in loops or recursion. This 
simplifies reasoning, both for the compiler and on top of 
the language. Iterators, external abstract functions, and types 
that rely on sharing, are implemented in a formally modelled 
foreign function interface, supported by a custom template- 
style C extension. In our file system implementation, a small 
library of abstract data types (ADTs) and iterators is suffi- 
cient, and the foreign-function interface is powerful enough 
to provide interoperability with an existing red-black tree 
implementation in C. 

In summary, we make the following contributions: 


1. We demonstrate the use of COGENT for implementing 
actual file systems (Section 3). 

2. We present proofs of high-level correctness properties of 
a file system implemented in Cogent and discuss how 
Cogent facilitates such proofs (Section 4). 

3. We discuss the effort involved in implementing file sys- 
tems with verified properties in Cogent (Section 5). 

4. We evaluate the COGENT-implemented file systems 
against their native C counterparts (Section 5). 

2. Overview of Our Approach 

2.1 Cogent 

Cogent is tailored to writing systems code, with a specific 
focus on file systems. It is intentionally more restrictive 
than general-purpose functional languages, to enable the use 
of efficient and powerful reasoning techniques on its high- 
level semantics, to avoid the need for an elaborate language 
run-time, and to enable the COGENT compiler to output 
an automatic proof that the C code it generates correctly 
implements the behaviour of the COGENT program [36]. In
this paper, we further demonstrate how to prove functional 
correctness of the COGENT program. 

Cogent manages to avoid a garbage collector, unlike 
most existing high-level languages such as Java, Haskell, 
and OCaml, by implementing a linear type system [49], 
which ensures that each linearly typed object is used exactly 
once. The type system not only provides memory safety, but 
also makes memory leaks compile-time errors. Traditional 
pure functional languages, such as Haskell, that disallow 
side effects, favour frequent copies of dynamically allocated 
data structures and make it hard to reason about the perfor- 
mance of the generated code. Cogent’s linear type sys- 
tem increases performance by allowing the compiled code 
to modify data structures in-place. The type system sup- 
ports parametric polymorphism. While doing so might not 
be an obvious choice for a language geared towards systems 
programming, it is particularly important in the context of 
Cogent, which relies on external ADTs to implement data 
structures that cannot be implemented with linear types. 

COGENT also supports higher-order functions. In contrast 
to general-purpose functional languages, however, all func- 
tions in Cogent have to be defined on the top-level. This 
restriction is necessary to avoid the run-time overhead and 
heap allocations necessary to implement closures otherwise. 

As an example, Figure 1 depicts a snippet of COGENT
from our ext2 implementation, specifically the
ext2_inode_get() function. This function looks up an inode
on disk, given its inode number inum. Lines 3-4 declare
the type of this function. It takes 3 arguments: an ExState, 
which is a reference to an ADT embodying the outside en- 
vironment of the file system; an FsState, a reference to 
a Cogent structure that holds the file system state, and 
a U32, an unsigned 32-bit integer that in this case is the 


 1  type RR c a b = (c, <Success a | Error b>)
 2
 3  ext2_inode_get : (ExState, FsState, U32) ->
 4      RR (ExState, FsState) (VfsInode) (U32)
 5  ext2_inode_get (ex, state, inum) =
 6    let ((ex, state), res) =
 7      ext2_inode_get_buf (ex, state, inum)
 8    in res
 9    | Success (buf_blk, offset) ->
10        -- read the inode, from the offset
11        let ((ex, state), res) =
12          deserialise_Inode (ex, state,
13                             buf_blk, offset,
14                             inum)
15          !buf_blk
16        in res
17        | Success inode ->
18            let ex =
19              osbuffer_destroy (ex, buf_blk)
20            in ((ex, state), Success inode)
21        | Error () ->
22            let ex =
23              osbuffer_destroy (ex, buf_blk)
24            in ((ex, state), Error eIO)
25    | Error (err) -> ((ex, state), Error err)
26
27  osbuffer_destroy : (ExState, OsBuffer) -> ExState

Figure 1. Looking up an inode in ext2, in COGENT.


number of the inode to be looked up. It returns a result of 
type RR (defined on line 1), which is a pair. Its first com- 
ponent c is mandatory data that must always be returned, 
while the second component is a tagged union value which 
signals whether the function succeeded (Success a) or not 
(Error b), where a and b are result values for the respective
cases. ext2_inode_get() (line 4) always returns both
the ExState and FsState; when successful it also returns
a VfsInode reference to the looked-up inode; otherwise it
returns a U32 error code instead.

ext2_inode_get() first calls (lines 6-7) the COGENT
function ext2_inode_get_buf(), which, after more calculation,
internally calls an ADT function to read the corresponding
block from disk. We match (lines 8, 9, 25) on the
result res to make a case distinction; for Success (line 9),
the result contains a reference buf_blk to a buffer, plus the
offset offset of the inode in that buffer. We use these to
deserialise the inode from the buffer into a VfsInode structure.

In this situation, linear types are unnecessarily restrictive.
Passing the linear value buf_blk to deserialise_Inode()
in line 12 would consume it, so deserialise_Inode()
would have to return it as an additional result for it to be
used again later on in the program. This is overly complicated,
given that deserialise_Inode() never modifies the
buffer. The “!” operator (in “!buf_blk”, line 15) allows us
temporarily to escape the use-once restriction of the linear
type system: it indicates that within the right-hand side of =
here, buf_blk is used read-only and can therefore be referenced
multiple times without consuming it. The type system
prevents deserialise_Inode() from modifying the buffer
that is holding the inode being deserialised. Moreover, it ensures
that no references to the buffer or parts of it can be returned,
as this would result in aliasing.

For the Error case (line 25), the result includes 
an error code that is simply propagated to the caller 
along with the return values that were mandatory. The 
Success/Error case distinction is repeated following the 
attempt to deserialise the inode from the buffer. In either 
case, the buffer must be released, by calling the ADT function
osbuffer_destroy(), whose type signature (line 27)
suggests it consumes the buffer. 

Note that in this function, COGENT’s linear type sys- 
tem would flag an error if the buffer buf_blk was never 
released. ext2_inode_get() will only type-check if all of
the Error cases are handled. Also observe that the manda- 
tory ExState and FsState values are threaded through 
the function. The ExState value encompasses all exter- 
nal state, and so includes for instance Linux’s buffer cache. 
Because COGENT is a pure functional language, it makes 
side effects explicit. This means this ExState is passed to 
osbuffer_destroy() explicitly, which then returns a new
ExState reference that encompasses the external state of 
the world after the buffer has been released. This does not 
mean that large structures are copied at run-time. In practice, 
Cogent’s linear type system allows the ExState update in 
osbuffer_destroy() to be performed in-place. The same
is true for the other functions that each consume and return 
an ExState or FsState. 
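The control flow of Figure 1 can be mimicked in Python as a rough sketch. Python has no linear types, so nothing here is checked at compile time; instead the toy ExState below records which buffers are live, so that a test can observe that every path releases the buffer. The function names follow Figure 1, but their bodies, the dictionary-based file system state, and the numeric error codes are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field

E_IO = 5  # assumed numeric value for the eIO error code


@dataclass(frozen=True)
class Success:
    value: object


@dataclass(frozen=True)
class Error:
    code: object


@dataclass
class ExState:
    """Toy stand-in for the external world: tracks live buffers only."""
    live_buffers: set = field(default_factory=set)


def ext2_inode_get_buf(ex, state, inum):
    """Hypothetical lookup: 'reads' the block holding inode `inum`."""
    if inum not in state["inodes"]:
        return (ex, state), Error(2)  # assumed 'no such entry' code
    buf_blk = ("buf", inum)
    ex.live_buffers.add(buf_blk)
    return (ex, state), Success((buf_blk, 0))


def deserialise_inode(ex, state, buf_blk, offset, inum):
    """Reads the buffer without modifying or retaining it."""
    raw = state["inodes"][inum]
    if raw is None:  # models a corrupt on-disk inode
        return (ex, state), Error(None)
    return (ex, state), Success({"inum": inum, "data": raw})


def osbuffer_destroy(ex, buf_blk):
    """Consumes the buffer and returns the updated external state."""
    ex.live_buffers.remove(buf_blk)
    return ex


def ext2_inode_get(ex, state, inum):
    (ex, state), res = ext2_inode_get_buf(ex, state, inum)
    if isinstance(res, Error):
        return (ex, state), Error(res.code)  # propagate the error code
    buf_blk, offset = res.value
    (ex, state), res = deserialise_inode(ex, state, buf_blk, offset, inum)
    # Figure 1 releases the buffer separately in each branch; it is
    # factored out here because it happens on both.
    ex = osbuffer_destroy(ex, buf_blk)
    if isinstance(res, Success):
        return (ex, state), Success(res.value)
    return (ex, state), Error(E_IO)
```

As in the COGENT version, the mandatory (ex, state) pair is returned on every path, and the in-place mutation of ex mirrors the in-place update that the linear type system permits.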

It is obvious from Figure 1 that COGENT’s current sur- 
face syntax for error handling is unnecessarily verbose. The 
textual overhead for the regular error handling pattern can 
easily be reduced by additional syntactic sugar in a future 
language iteration without touching the verification infrastructure;
we merely focused on verification first.

2.2 Cogent File Systems 

We implement two file systems in COGENT, to drive the development
of the language and as case studies for automatic
proof generation. The first is an almost feature-complete
revision 1 ext2 implementation: it passes the POSIX File System
Test Suite [35], except for the ACL and symlink tests (as
we have not implemented those features). Its performance is
comparable to Linux’s native ext2 implementation. For contrast,
the second is a new flash file system called BilbyFs [23]
whose design strikes a balance between the simplicity of
JFFS2 and the performance of UBIFS, the two most popular
Linux flash file systems.

The COGENT implementations of ext2 and BilbyFs share
a common library of ADTs that includes fixed-length arrays
for words and structures, simple iterators for implementing
for-loops, and COGENT stubs for accessing a range of
Linux APIs such as the buffer cache and its native red-black
tree implementation, plus iterators for each ADT. The interfaces
exposed by these ADTs are carefully designed to
ensure compatibility with Cogent’s linear type system.

Figure 2. Code/Proof Co-Generation Process.

For full functional correctness, these ADTs need to be 
verified separately. We have done so for a smaller instance 
(the WordArray ADT) to validate the cross-language se- 
mantics, but have not verified for instance the red-black tree 
implementation, although it is certainly feasible to do so. For 
example, a verification of a red-black tree implementation 
(although in a higher-order logic and thus not usable in our 
framework) is part of the Isabelle/HOL [34] standard library. 

The ext2 implementation showcases Cogent’s ability to 
enable the re-engineering of existing file systems, and thus 
its potential to provide an incremental upgrade path to in- 
crease the reliability of existing systems code. BilbyFs, on 
the other hand, provides a glimpse of how to design and en- 
gineer new file systems that are not only performant, but 
amenable to being verified as correct against a high-level 
specification of file system correctness. To this end, Bil- 
byFs pursues aggressive modular decomposition, whereas 
the structure of the ext2 implementation mirrors that of its 
native Linux counterpart. Section 3 presents the two file sys- 
tems in more detail. 

2.3 Automatic Code/Proof Co-Generation 

Details of the code/proof co-generation form the topic of 
separate work [36]. We provide a brief summary below. 

Figure 2 shows the code and proof generation process: the 
COGENT compiler generates C code, which is then compiled 
by some C compiler, linking against the ADT library, to produce
binary files. Besides the C code, the COGENT compiler
also generates a formal specification in Isabelle/HOL that 
precisely encodes the semantics of the source COGENT pro- 
gram, and a refinement proof that this semantics is correctly 
implemented by the C code. The specification can be used to 
prove higher-level properties about the COGENT program. 

The generated C code can be compiled by standard com- 
pilers such as gcc, and also CompCert. We assign the gener- 
ated C language subset a semantics, which is from Norrish’s 
C Parser [50], originally developed for the seL4 verification. 
Since Norrish’s C Parser also supplies the C semantics for 
Sewell et al.’s gcc translation validation tool [46], we can in 


future extend the compiler-generated proofs for the C code 
to the binary level. 

The refinement proofs state that every behaviour exhib- 
ited by the C code can also be exhibited by the COGENT code 
and, furthermore, that the C code is always well-defined, in- 
cluding that e.g. the generated C code never dereferences a 
null pointer, and never causes signed overflow. They also imply
that the generated C code is type-safe and memory-safe,
meaning the code will never try to dereference an invalid
pointer, or try to dereference two aliasing pointers of
incompatible types. In conjunction with the COGENT typing 
proofs, generated by the COGENT compiler for the input pro- 
gram, we get additional guarantees that the generated code 
handles all error cases, is free of memory leaks, and never 
double-frees a pointer. 

These proofs do not guarantee, for instance, that the im- 
plementation always behaves as a proper file system — e.g. 
that data written to disk will always be able to be read back. 
Providing that kind of additional guarantee requires proving 
the Cogent code functionally correct against a high-level 
specification of file system functionality. 

In later sections, we show how to reason on top of the COGENT
formal specification in Isabelle/HOL, leveraging COGENT’s
purely functional semantics. Reasoning here is simpler
than reasoning on C code directly because the COGENT
semantics is represented as pure functions in the logic, mak- 
ing it possible to reason equationally about it via Isabelle’s 
powerful rewriting engine. We return to this point in Sec- 
tion 4. 

If desired, one could further formalise different logics on 
top of our generated COGENT specification to further sim- 
plify reasoning about different domain-specific properties. 
For example, one could formalise a Crash Hoare Logic [8] to
simplify reasoning about crash safety.


3. File Systems in Cogent 

As mentioned in Section 2, we have implemented two file 
systems in COGENT: an ext2 implementation, and a new 
raw flash file system called BilbyFs. Both file system im- 
plementations sit below Linux’s virtual file system switch 
(VFS) module, using C stub code to provide the top-level en- 
try points expected by the VFS. These C stubs call directly 
into the top-level COGENT functions that implement each 
file system, using locking to prevent two COGENT functions 
from executing concurrently (because our current verifica- 
tion technology does not support concurrency). 

An ADT provides a common interface to the VFS within 
Cogent, while each file system uses a separate ADT for 
interfacing with the physical storage medium, as these differ 
between the two. Both ADTs capture all relevant properties 
of the underlying media, and should be fully re-usable for 
other file systems targeting the same media. 



3.1 ext2 

ext2 is a well-known file system for block devices. While on 
large block devices it has long been supplanted by journaling 
file systems, which provide better reliability guarantees in 
the event of a crash, it remains popular for smaller block 
devices like SD cards and USB flash memory. 

The Cogent ext2 implementation follows ext2fs of 
Linux 3.14.12; essentially we transliterated the Linux im- 
plementation into Cogent. This approach tests Cogent’s 
ability to let systems programmers (re)implement their code, 
and in the process gain (proof-) guaranteed type- and 
memory-safety, as well as absence of undefined behaviour, 
missing error cases and memory leaks (as discussed in Sec- 
tion 1). 

In its current state, our ext2 implementation has a num- 
ber of limitations compared to Linux ext2fs. Each of these 
is straightforward to remove, except the lock that forces the 
code to run without concurrency. It emulates an early version
(revision 1) of ext2, with 1k blocks and 128-byte inodes.
It also does not yet support read-ahead or direct-IO, and uses 
a simpler block allocation algorithm than Linux, so the or- 
der of blocks on disk is different, leading to different seek 
times, and different on-disk fragmentation. The implemen- 
tation also currently elides extended attributes, quotas, re- 
served blocks and symlinks; however, none of these features 
are exercised by the benchmarks presented in Section 5. 

3.2 BilbyFs 

BilbyFs is meant to serve as a case study in how to design 
new file systems that are fully verifiable, meaning that not 
only is their generated C code proved to always implement 
their COGENT semantics, but that their COGENT semantics 
can also be verified to satisfy high-level correctness proper- 
ties, such as full functional correctness against an abstract 
specification of file system correctness. 

Unlike ext2, BilbyFs is not a block device file system. In- 
stead it is a file system for raw flash devices, like those on 
embedded devices. BilbyFs’ design draws inspiration from 
the two most popular raw flash file systems, JFFS2, which 
was merged into Linux in 2001; and the newer UBIFS, 
which was merged in 2008. BilbyFs’ design balances the 
simplicity of JFFS2 with the run-time performance proper- 
ties of UBIFS. Like each of these file systems, BilbyFs is 
a log-structured file system. Like JFFS2 and UBIFS, it pro- 
vides crash-tolerance by structuring flash updates in atomic 
transactions, and discarding incomplete transactions when 
re-mounting following a crash. 

Like UBIFS, BilbyFs writes data to the flash asyn- 
chronously, allowing otherwise small writes to be batched 
into large transactions to improve metadata packing and 
throughput [19]. Like JFFS2, however, BilbyFs eschews
storing the flash index, which records the on-flash location
of each file system object, on the flash. Instead it maintains
the index in memory. This means that the index must be
reconstructed at mount time, and limits the maximum medium
size that can be supported.

Figure 3. The modular design of BilbyFs.

As mentioned, BilbyFs’ design is rooted in aggressive 
modular decomposition, to reduce the cost of completing a 
manual proof of full functional correctness of its COGENT 
code, building on top of the automatic correctness proofs 
generated by the COGENT compiler. Recent research [32] 
provides evidence suggesting that the effort required for 
manual verification scales quadratically with the size of the 
statement to be proved. This indicates that increasing modu- 
larity will reduce the overall verification effort. 
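As a back-of-the-envelope illustration of this argument (not a result taken from [32]), assume that effort grows exactly as the square of the statement size, and that a statement can be split into equally sized modules that are verified independently. Both are simplifying assumptions, and the function names are hypothetical.

```python
def effort(size, c=1.0):
    """Assumed cost model: effort = c * size**2 (illustrative only)."""
    return c * size ** 2


def modular_effort(total_size, modules, c=1.0):
    """Total effort when the statement is split into equal-sized modules.

    Ignores the cost of stating and proving the interfaces between
    modules, which a real development would have to pay.
    """
    return modules * effort(total_size / modules, c)


monolithic = effort(1000)        # one statement of size 1000
split = modular_effort(1000, 5)  # five statements of size 200 each
```

Under this idealised model, splitting into k modules divides the total effort by k; the interface proofs that the sketch leaves out would reduce that gain in practice.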

BilbyFs’ modular design is depicted in Figure 3. At the 
bottom level, BilbyFs interfaces with Linux’s UBI compo- 
nent via a Cogent ADT. It uses UBI to read and write the 
flash, allowing UBI to handle wear levelling and manage 
logical erase blocks as it does for UBIFS. The in-memory 
Index component, one level up, implemented in COGENT, 
tracks the on-flash address of each file system object. The 
ObjectStore uses this Index to provide an abstract interface 
for reading and writing generic objects on flash. To do so it 
also makes use of the FreeSpaceManager, which in turn is 
used by the GarbageCollector to free erase blocks no longer 
in use. Finally, the FsOperations component implements the 
top-level file system operations and objects, like inodes,
directory entries and data blocks. This decomposition ensures
that the key file system logic is confined to the FsOperations 
component, while the physical representation of objects on 
flash is handled by the ObjectStore. 

3.3 Reusable Abstract Data Types 

As mentioned, the two file systems share a common ADT 
library (7 ADTs in total). It includes the WordArray type 
mentioned in Section 2, for arrays of primitive words; a sep- 
arate polymorphic Array type for (linear) heap values; com- 
mon utility functions like iterators for implementing for- 
loops with early exit and accumulators, and (inline) func- 
tions for manipulating machine words; a heapsort implemen- 
tation; and polymorphic linked lists. It also includes stubs for 
accessing existing kernel APIs, including the buffer cache 
(e.g. osbuffer_destroy() from Figure 1), the page cache,
a native red-black tree implementation, checksum functions, 
time and date functions, and the VFS. 


Some care is needed when designing ADTs that respect 
and enforce the constraints of COGENT’s linear type system. 
For example, the function for accessing an element of the
general polymorphic Array type must remove that element from
the array, so that it cannot be accessed a second time,
inadvertently giving two writable references to a single value,
which would violate the constraints of the linear type system.
This is why we have
a separate WordArray type for strings of (non-linear) ma- 
chine words. When accessing array-elements read-only, no 
such removal is needed: the type system guarantees that the 
aliasing is safe in such cases. 
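The idea of removing an element on writable access can be sketched in Python. This is not the interface of the Array ADT from our library; the class and method names are hypothetical, and run-time checks stand in for what COGENT's type system enforces statically.

```python
class LinearArray:
    """Toy array of owned values; a taken slot is empty until refilled."""

    _EMPTY = object()

    def __init__(self, items):
        self._slots = list(items)

    def take(self, i):
        """Remove and return element i, so no second writable alias exists."""
        value = self._slots[i]
        if value is LinearArray._EMPTY:
            raise LookupError(f"slot {i} has already been taken")
        self._slots[i] = LinearArray._EMPTY
        return value

    def put(self, i, value):
        """Hand ownership of `value` back to slot i."""
        if self._slots[i] is not LinearArray._EMPTY:
            raise ValueError(f"slot {i} is still occupied")
        self._slots[i] = value

    def peek(self, i):
        """Read-only access: nothing is removed."""
        value = self._slots[i]
        if value is LinearArray._EMPTY:
            raise LookupError(f"slot {i} is empty")
        return value
```

A caller that wants to modify an element takes it, updates it, and puts it back; a second take of the same slot fails, whereas peek may be called any number of times.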

While designing and implementing ADTs thus requires 
some skill, it should be a relatively rare activity because 
ADTs are freely reusable. Given that our current ADT li- 
brary is shared between the two relatively different file sys- 
tems, we expect that it should be completely reusable for 
other file system implementations as well. 

4. Formal Verification 

Recall from Section 2.3 that the COGENT compiler gener- 
ates C code as well as a high-level Isabelle/HOL specifica- 
tion from the input COGENT program. In this section, we 
describe how this high-level specification facilitates further, 
formal reasoning at much reduced effort compared to tradi- 
tional functional correctness verification as typified by e.g. 
seL4 [26]. To this end, we describe two BilbyFs high-level 
functional correctness properties that we proved. 

4.1 Top-Level Correctness Specification 

These proofs show that the sync() and iget() operations
of BilbyFs are functionally correct, meaning that they be- 
have correctly in accordance with a top-level, abstract spec- 
ification for these operations. This abstract specification is 
written directly in Isabelle’s higher-order logic. It is short 
enough that a human can audit it to ensure that it accurately 
captures the intended behaviour of these operations. A com- 
plete description of our top-level specifications is available 
in separate work [3]. The top-level specifications for sync()
and iget() are depicted in Figure 4. The total line count of
both specifications including all their dependencies is 239 
lines of Isabelle/HOL. 

sync() and iget() each implement the corresponding
functions expected by the Linux VFS layer. The sync() operation
synchronises the current in-memory state of the file
system to physical storage. The in-memory state may differ
from the physical state because, as mentioned in Section 3.2,
BilbyFs buffers pending writes in memory to improve
metadata packing and throughput. The top-level abstract
file system (AFS) specification for sync(), afs_sync,
operates over the abstract file system state afs, which tracks
the state of the physical storage medium (med), the pending
in-memory medium updates (updates), and whether the
file system is currently read-only (is_readonly). The specification
says that sync() first checks whether the file system
is read-only, in which case an appropriate Error code is
returned with the file system state unchanged (lines 2 and 3).
Otherwise, it applies the in-memory updates to the physical
medium.

Its specification is sufficiently nondeterministic to cap- 
ture the behaviour of a correct file system under the situa- 
tion when the in-memory updates are only partially applied, 
perhaps because of a flash device failure part-way through. 
For this reason, the specification allows any number of up- 
dates n (line 5) to succeed, between 0 and the total number 
of updates currently in-memory (i.e. length (updates afs)). 
It then (lines 8 and 9) applies the first n updates, toapply, to
the physical medium med afs of file system state afs, and
remembers the updates that remain to be applied, rem. If all
updates were applied, it returns successfully, yielding the
new file system state (lines 10 and 11). Otherwise (lines 12
to 14), it returns an appropriate error code, selected
nondeterministically because the specification abstracts away from
the precise reason why the failure might have occurred. In
case of an I/O error (eIO), the file system is also put into
read-only mode.
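The nondeterminism in the sync() specification can be made concrete with a small executable model in Python that enumerates every outcome the specification permits. This is only a sketch: the dictionary encoding of afs, the tuple model of the medium, and the body of apply_updates are assumptions for illustration, not part of the Isabelle/HOL development.

```python
E_ROFS, E_IO, E_NOMEM, E_NOSPC, E_OVERFLOW = (
    "eRoFs", "eIO", "eNoMem", "eNoSpc", "eOverflow")


def apply_updates(updates, med):
    """Assumed medium model: a tuple of the updates applied so far."""
    return med + tuple(updates)


def afs_sync_outcomes(afs):
    """Return every (new_afs, result) pair the specification permits.

    `afs` is a dict with keys 'med', 'updates' and 'is_readonly';
    'med' and 'updates' are tuples. Nondeterministic choice is
    modelled by enumerating all options.
    """
    if afs["is_readonly"]:
        return [(afs, ("Error", E_ROFS))]
    outcomes = []
    updates = afs["updates"]
    for n in range(len(updates) + 1):  # n in {0..length updates}
        toapply, rem = updates[:n], updates[n:]
        new_afs = dict(afs, med=apply_updates(toapply, afs["med"]),
                       updates=rem)
        if not rem:
            outcomes.append((new_afs, ("Success", None)))
        else:
            for e in (E_IO, E_NOMEM, E_NOSPC, E_OVERFLOW):
                failed = dict(new_afs, is_readonly=(e == E_IO))
                outcomes.append((failed, ("Error", e)))
    return outcomes
```

With two pending updates the model yields nine outcomes: one success and four possible error codes for each of the two partial prefixes. Checking that an implementation's observed result is a member of this list is a toy analogue of the refinement statement proved in Isabelle/HOL.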

The iget() operation looks up inodes on the physical
medium. It takes an inode number inum and a VFS inode
structure vnode. It first (line 2) checks whether an inode
with the given inode number exists in the file system. To
do so it must consult both the in-memory and on-medium
state, computing via the expression updated_afs afs what
the file system state would be if it were synchronised to
the medium. If the inode number is not present (lines 11
and 12) an appropriate error code is returned. Otherwise,
the iget() specification reads the inode with that number
from the medium (line 4) and then returns appropriately
based on whether the inode read succeeded (lines 6 and 7) or
produced an error (lines 8 and 9). In the case of Success, the
inode must be converted to a VFS inode structure for
interoperating with the Linux VFS. Observe that the iget()
specification does not return an updated afs structure: thus
its type signature automatically captures that it can never
modify the abstract file system state.
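The shape of the iget() specification can likewise be sketched in Python. The encoding is an assumption made for illustration: the medium is a dictionary from inode numbers to inodes, pending updates are (inum, inode) pairs, and a read_fails flag stands in for whatever causes the read to produce an error; none of this reflects the actual Isabelle/HOL definitions.

```python
E_NOENT = "eNoEnt"
E_IO = "eIO"


def updated_afs(afs):
    """Assumed model: medium contents with pending updates applied."""
    view = dict(afs["med"])
    for inum, inode in afs["updates"]:
        view[inum] = inode
    return view


def read_afs_inode(afs, inum, read_fails=False):
    """Hypothetical read; `read_fails` stands in for a medium error."""
    if read_fails:
        return ("Error", E_IO)
    return ("Success", updated_afs(afs)[inum])


def inode2vnode(inode):
    """Assumed conversion to a VFS inode; here just a tagged copy."""
    return {"vfs": True, **inode}


def afs_iget(afs, inum, vnode, read_fails=False):
    """Returns (vnode, result); the afs state is never returned or changed."""
    if inum in updated_afs(afs):
        tag, payload = read_afs_inode(afs, inum, read_fails)
        if tag == "Success":
            return inode2vnode(payload), ("Success", None)
        return vnode, ("Error", payload)
    return vnode, ("Error", E_NOENT)
```

An inode that exists only in the pending updates is still found, because the existence check consults updated_afs rather than the medium alone.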

Note that the iget() specification intentionally elides
interaction with the Linux inode cache. These are managed
by a trivial amount of C code that sits between the Linux
VFS layer and the BilbyFs implementation as produced by
the COGENT compiler. Incorporating them into the COGENT
implementation of BilbyFs (and thus into e.g. the iget()
specification) would add little value, as these caches would
be just large unverified ADTs and opaque to COGENT and
the verification.

4.2 Functional Correctness Proof 

We prove the correctness of the BilbyFs sync() and iget()
operations against their top-level specifications of Figure 4.
The proof follows the modular decomposition of the BilbyFs
implementation from Figure 3 (see Section 3.2). For both


     1  afs_sync afs =
     2    if is_readonly afs then
     3      return (afs, Error eRoFs)
     4    else do
     5      n ← select {0..length (updates afs)};
     6      let updates = updates afs;
     7        (toapply, rem) = (take n updates, drop n updates);
     8        afs = (afs(|med := apply_updates toapply (med afs),
     9                    updates := rem|));
    10      in if rem = [] then
    11        return (afs, Success ())
    12      else do
    13        e ← select {eIO, eNoMem, eNoSpc, eOverflow};
    14        return (afs(|is_readonly := (e = eIO)|), Error e)
    15      od
    16    od


     1  afs_iget afs inum vnode =
     2    (if inum ∈ dom (updated_afs afs) then
     3      do
     4        r ← read_afs_inode afs inum;
     5        case r of
     6          Success inode =>
     7            return (inode2vnode inode, Success ())
     8        | Error e =>
     9            return (vnode, Error e)
    10      od
    11    else
    12      return (vnode, Error eNoEnt))


Figure 4. Top-level specifications for sync() and iget(), against which their BilbyFs COGENT implementations are verified.
Figure 5. Modular functional correctness proof of BilbyFs.


operations, we prove the full component chain up to the UBI 
component. 

Figure 5 depicts the structure of this proof. At the top, 
we prove that the code for the VFS-facing component of
BilbyFs, called FsOperations, correctly refines its high-level 
specification as given in the abstract file system specifica- 
tion (AFS) of Figure 4. To complete this proof we formally 
define how to map the abstract file system state (afs of Fig- 
ure 4) to the concrete file system state as implemented in 
COGENT. For instance, the abstract in-memory list of up- 
dates (updates afs of Figure 4) corresponds to an in-memory 
buffer where pending medium-writes are stored. The map- 
ping from the former to the latter requires parsing the physi- 
cal buffer contents (a list of bytes). In addition, the physical 
medium state (med afs) corresponds to the state of the UBI 
component of BilbyFs, which provides the interface to the 
flash storage. Similar to the in-memory buffer, mapping the 
state of the UBI component to its abstract representation in 


the AFS requires logically mimicking the file system mount 
operation, to parse the entire physical medium. In this sense, 
our proofs deal directly with the raw bytes stored in-memory 
and on-flash by the file system. 
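
To illustrate what such a mapping involves, the following Python sketch
defines an abstraction function that recovers a logical list of updates
from a raw byte buffer. The record layout used here (a little-endian
object id and payload length followed by the payload) is an assumption
chosen for the example and is not the BilbyFs on-flash format; the
names serialise and abstract_updates are likewise hypothetical.

```python
import struct
from typing import List, Tuple

# Assumed record header: 32-bit object id, 32-bit payload length.
HEADER = struct.Struct("<II")


def serialise(updates: List[Tuple[int, bytes]]) -> bytes:
    """Concrete representation: records laid out back to back in a buffer."""
    return b"".join(HEADER.pack(oid, len(data)) + data
                    for oid, data in updates)


def abstract_updates(buf: bytes) -> List[Tuple[int, bytes]]:
    """Abstraction function: recover the logical update list from raw bytes."""
    updates, off = [], 0
    while off < len(buf):
        if off + HEADER.size > len(buf):
            raise ValueError("truncated header")
        oid, length = HEADER.unpack_from(buf, off)
        off += HEADER.size
        if off + length > len(buf):
            raise ValueError("truncated payload")
        updates.append((oid, buf[off:off + length]))
        off += length
    return updates
```

In a refinement proof the analogue of abstract_updates relates each
concrete buffer to the abstract list it represents; buffers that do not
parse correspond to states excluded by the invariant.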

The FsOperations proof relies on formal assumptions 
about the ObjectStore. Those assumptions form an ax- 
iomatic specification of the ObjectStore, which serves as a 
compact representation of its correctness that abstracts away 
from its implementation details. 

For sync(), the proof proceeds by establishing that the
ObjectStore implementation indeed satisfies these axioms. 
This proof in turn makes use of axiomatic specifications for 
the components on which the ObjectStore relies, the Index 
and FreeSpaceManager, whose COGENT implementations 
we also prove satisfy their axioms. The proof bottoms out 
at the UBI component which, being entirely abstract, is 
captured by an axiomatic specification only. The validity 
of the entire functional correctness proof then rests on the 
validity of the assumptions encoded in the UBI specification. 
We return to these assumptions later in Section 4.4. 

4.3 Ease of Proof 

While they establish a similar property, these proofs are 
far simpler than e.g. the comparable functional correctness 
proofs of seL4 [26]. Just as with seL4, the functional cor- 
rectness proof here relies on establishing global invariants 
of the abstract specification and its implementation. For Bil- 
byFs, the invariants include e.g. the absence of link cycles, 
dangling links and the correctness of link counts, as well as 
the consistency of information that is duplicated in the file 
system for efficiency. 

Importantly, unlike with seL4, none of the invariants have 
to include that in-memory objects do not overlap, or that 
object-pointers are correctly aligned and do point to valid 
objects. All of these details are handled automatically by 


Cogent’s type system and are justified by the C code cor- 
rectness proofs generated by the COGENT compiler. 

Even better, when proving that the file system correctly 
maintains its invariants, we get to reason over pure, func- 
tional specifications of the COGENT code. Because they are 
pure functions, these specifications do not deal with muta- 
ble state (as e.g., the seL4 ones do). Thus when proving that 
they satisfy their invariants, or refine the top-level AFS, we 
need not resort to cumbersome machinery like separation 
logic [40]: each function can be reasoned about by simply 
unfolding its definition, and the presence of separation be- 
tween objects x and y follows trivially from x and y being 
separate variables. 

Functional programmers have long recognised, and advo- 
cated for, the benefits afforded by reasoning over pure func- 
tions. Cogent puts these benefits directly into the hands of 
verification engineers without the need for a large, untrusted 
run-time system. 

4.4 Assumptions and Invariants 

As mentioned above, the BilbyFs proofs rest on the
assumptions made about the Linux UBI component, as
encoded in its axiomatic specification. At present, these
assumptions are a little more restrictive than could be expected
of a flash memory in practice. In particular, the axiom for
the ubi_write function, which takes a buffer and writes it
to the flash, states that either the entire write succeeds, or
it fails leaving the flash unchanged. In practice, this write
may be spread across multiple flash pages, each of which
may succeed or fail, and a failure may leave the flash
page only partially written, or even corrupted [48]. With
additional work, these assumptions can be made fully realistic.
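
The difference between the current axiom and a more realistic one can
be sketched as two sets of permitted outcomes. The Python model below
is an illustration under simplifying assumptions: flash contents are a
tuple of byte values, the write fits within the flash, and the weaker
model only allows a write to stop after a whole page (corrupted or
partially programmed pages are not modelled). The function names are
hypothetical and do not correspond to the UBI interface.

```python
from typing import List, Tuple

Flash = Tuple[int, ...]  # assumed model: flash contents as byte values


def ubi_write_atomic(flash: Flash, off: int, buf: bytes) -> List[Flash]:
    """Outcomes allowed by the all-or-nothing axiom: full write, or no change."""
    written = flash[:off] + tuple(buf) + flash[off + len(buf):]
    return [written, flash]


def ubi_write_paged(flash: Flash, off: int, buf: bytes,
                    page: int) -> List[Flash]:
    """Weaker model: the write may stop after any whole page (page > 0)."""
    outcomes = [flash]
    for end in range(page, len(buf) + page, page):
        chunk = buf[:min(end, len(buf))]
        outcomes.append(flash[:off] + tuple(chunk) + flash[off + len(chunk):])
    return outcomes
```

Every outcome of the atomic model is also an outcome of the paged
model, so code verified against the weaker model would remain correct
under the stronger assumption, but not the other way round.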

The proofs assume that the file system invariant holds be- 
fore invoking sync() and iget(), and we prove that these 
operations maintain it. The invariant talks about the contents 
of erase-blocks and wbuf, the in-memory buffer that stores 
pending updates. It asserts that the contents of erase-blocks 
and wbuf must form a valid log, i.e., data can be parsed as 
a sequence of valid transactions. After parsing, the logical 
representation of a transaction is a list of file system objects 
such as inode, directory entry and data blocks. The invari- 
ant also says that each transaction has a unique transaction 
number that indicates the order in which transactions must 
be applied when mounting the file system. 
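
The role of transaction numbers in this invariant can be illustrated
with a short Python sketch. It assumes that the log has already been
parsed into (transaction number, objects) pairs, with each object an
(object id, contents) pair; this logical form and the function names
are assumptions for the example rather than the definitions used in
the proof.

```python
from typing import Dict, List, Tuple

# Assumed logical form after parsing the erase-blocks and wbuf.
Transaction = Tuple[int, List[Tuple[int, bytes]]]


def unique_transaction_numbers(log: List[Transaction]) -> bool:
    """Part of the invariant: no two transactions share a number."""
    numbers = [num for num, _ in log]
    return len(numbers) == len(set(numbers))


def replay(log: List[Transaction]) -> Dict[int, bytes]:
    """Mount-time replay: apply transactions in increasing number order."""
    if not unique_transaction_numbers(log):
        raise ValueError("invariant violated: duplicate transaction number")
    state: Dict[int, bytes] = {}
    for _, objects in sorted(log, key=lambda t: t[0]):
        for oid, contents in objects:
            state[oid] = contents
    return state
```

Uniqueness is what makes the replay order, and hence the mounted state,
well defined regardless of where transactions are physically located.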

5. Evaluation 

We evaluate COGENT from two perspectives: its ease-of-use 
as a systems language and verification target, and how well 
file systems implemented in COGENT perform. 

5.1 Experience with COGENT 

5.1.1 COGENT as a systems implementation language 

In order to shed light on Cogent’s usability as a systems 
programming language, we briefly describe the experience 


of developing our ext2 and BilbyFs implementations. In 
both cases, we started from a C implementation. In the 
case of ext2, this was Linux’s ext2fs implementation; for
BilbyFs it was our own implementation of the file system
that was used to prototype its modular design. The two
file systems were written by separate developers, but in the
case of BilbyFs the same developer wrote both its C and
COGENT implementations. Both developers were already
familiar with functional programming; the ext2 developer is
an undergraduate student.

Naturally, the COGENT language evolved in the process - 
at the time of the initial implementations, the language had 
linear types but no polymorphism and limited support for 
loops. The developers jointly wrote the shared ADT library, 
and the ext2 developer spent considerable time assisting with 
COGENT toolchain design and development. Unfortunately, 
this makes it infeasible to give accurate effort estimates for 
how long each file system would have taken to write had the 
language and toolchain been stable, as they are now. 

Having to adopt COGENT’s functional style was not a 
major barrier for either developer; indeed one reported that 
COGENT’s use of nested let-expressions for sequencing 
and error handling aided his understanding of the potential 
control paths of his code. While both had to get used to the 
linear type system, both reported that this happened quite 
quickly and that the linear type system generally did not 
impose much of a burden when writing ordinary COGENT 
code. Both developers noted the usefulness of COGENT’s 
linear types for tracking memory allocation and catching 
memory leaks. 

Where linear types did cause friction was mostly when
having to design the shared ADT interfaces to respect the
constraints of the type system, as mentioned earlier in
Section 3.3. Another place where linear types caused friction
was when implementing the rename() operation in
COGENT. rename() takes two directory arguments, the source
and target directory of the file to be renamed. When
renaming a file without changing its directory, these two
arguments are identical. As COGENT does not allow such
aliasing, we need two versions of this operation, leading to about
150 lines of essentially replicated code.

Both developers reported that the strong type system pro- 
vided by COGENT decreased the time they usually would 
have spent debugging, which is to be expected. Logic bugs,
which cannot be captured by the static semantics, can re- 
main in COGENT code and are harder to debug because of 
lack of tool support. The developers, however, found com- 
paratively few bugs in the COGENT code; the vast majority 
of bugs were in the C wrapper code or, to a lesser degree, 
in the C ADT implementations. Both developers reported 
that they felt that the COGENT support for a template-style C 
extension increased productivity in writing ADT implemen- 
tations. Another reported benefit of COGENT’s pluggable 


ADTs was the ability to switch-in different ADT implemen- 
tations to aid debugging. 


System     native C    COGENT    generated C
ext2       4,077       2,789     12,066
BilbyFs    4,021       4,643     18,182

Table 1. Implementation source lines of code, native vs.
COGENT, measured with sloccount. Generated line counts
include ADTs.

Table 1 shows the source code sizes of the two systems. 
For the native ext2 system (i.e. the Linux code) we exclude 
code that implements features our COGENT implementation 
does not currently support. We can see that for the ext2 sys- 
tem, the Cogent implementation is about 2/3 the size of C. 
BilbyFs’ COGENT implementation is larger than ext2’s rela- 
tive to their respective native C implementations. This is be- 
cause BilbyFs makes much heavier use of the various ADTs 
from the common ADT library, some of which present fairly 
verbose client interfaces in their current implementation. 
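
The ratios discussed here follow directly from Table 1. The short
Python calculation below reproduces them; the dictionary layout and the
function name are choices made for this illustration.

```python
# Source lines of code from Table 1: (native C, COGENT, generated C).
SLOC = {
    "ext2": (4077, 2789, 12066),
    "BilbyFs": (4021, 4643, 18182),
}


def ratios(native: int, cogent: int, generated: int):
    """Return (COGENT / native C, generated C / COGENT)."""
    return cogent / native, generated / cogent


for system, counts in SLOC.items():
    rel, blowup = ratios(*counts)
    print(f"{system}: COGENT/native = {rel:.2f}, "
          f"generated/COGENT = {blowup:.2f}")
```

For ext2 the first ratio is about 0.68, matching the "about 2/3"
figure, while for BilbyFs it is about 1.15. The generated C is roughly
four times the size of the COGENT source in both cases.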

The current COGENT language can be viewed as a core 
language: we intentionally kept it minimal, without much 
syntactic sugar or support for common usage patterns. 
Rather than speculating about the optimal set of language 
shortcuts, we implemented the two case studies in the core 
language. This was obviously a source of frustration for both 
developers, but it gave us a better understanding of which 
additions are actually worthwhile. For example, code se- 
quences as depicted in Figure 1, where the error handling 
code simply frees all the memory allocated in the sequence 
so far before returning an error value, are common. The pro- 
grammer should not have to write such boilerplate code, es- 
pecially because the type system provides sufficient infor- 
mation for the compiler to identify the objects that have to 
be destroyed before returning. Language support for this and 
other common patterns can be added fairly easily: as the 
compiler will just desugar these patterns into core language 
constructs, they do not affect the verification or code gener- 
ation at all, and are orthogonal to the issues discussed here. 
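
As an analogy only, the "release everything allocated so far on error"
pattern can be written in Python with contextlib.ExitStack. This is not
COGENT syntax, and COGENT returns error values rather than raising
exceptions; alloc and free are hypothetical callables supplied by the
caller. The sketch merely shows the kind of bookkeeping that such
language support would take off the programmer's hands.

```python
from contextlib import ExitStack


def build_pair(alloc, free):
    """Allocate two objects; on failure, release whatever exists so far."""
    with ExitStack() as cleanup:
        a = alloc("a")
        cleanup.callback(free, a)
        b = alloc("b")          # if this raises, `a` is freed automatically
        cleanup.callback(free, b)
        cleanup.pop_all()       # success: ownership passes to the caller
        return a, b
```

The cleanup actions are registered next to each allocation and run in
reverse order only on the failure path, which mirrors what a desugaring
step could generate from the information in the linear types.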

The blowout in size of the generated C code in Table 1 
is mostly a result of normalisation steps applied by the CO- 
GENT compiler. Most of this is easily optimised away by the 
C compiler. However, we found that gcc’s optimiser does an
unsatisfactory job of optimising operations on large structs, 
resulting in unnecessary copy operations left in the code. 
This could be addressed by producing more optimised CO- 
GENT output for such cases. 

It is also quite difficult to analyse performance of CO- 
GENT code, as links between assembly output and COGENT 
source code tend to be convoluted. 

5.1.2 COGENT as a verification target 

COGENT was carefully designed to provide a usable verifica- 
tion target for higher-level reasoning on top of its functional 
semantics. Specifically, we took care to ensure its semantics 


could be directly encoded as a pure, side-effect-free function 
in Isabelle’s higher-order logic. 

Recall from Section 2.3 that the COGENT compiler 
proves refinement, which means that reasoning about the 
generated specification is sound with respect to the gener- 
ated C code, in that any property proved of the former is 
guaranteed to hold for the latter. 
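
The following Python sketch illustrates why refinement transfers
properties from a specification to an implementation. It is a toy
model: the specification maps each input to the set of results it
allows, the implementation is a deterministic function, and both are
checked over a finite set of inputs, whereas the COGENT proofs hold for
all inputs. All names in the sketch are assumptions made for the
illustration.

```python
from typing import Callable, Iterable, Set, TypeVar

In = TypeVar("In")
Out = TypeVar("Out")


def refines(impl: Callable[[In], Out],
            spec: Callable[[In], Set[Out]],
            inputs: Iterable[In]) -> bool:
    """True if every implementation result is one the specification allows."""
    return all(impl(x) in spec(x) for x in inputs)


def holds_for_spec(prop: Callable[[In, Out], bool],
                   spec: Callable[[In], Set[Out]],
                   inputs: Iterable[In]) -> bool:
    """True if the property holds for every behaviour of the specification."""
    return all(prop(x, y) for x in inputs for y in spec(x))


# Toy instance: the specification allows any result between 0 and x,
# and the implementation picks one of them deterministically.
def spec(x: int) -> Set[int]:
    return set(range(x + 1))


def impl(x: int) -> int:
    return x // 2
```

If refines and holds_for_spec both return True for the same inputs,
then the property necessarily holds for impl on those inputs, because
each result of impl is among the results already checked for spec.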

The linear type system is critical here, as we can only 
give COGENT a purely functional semantics because the 
linear types limit aliasing. Pure functions are naturally easy 
to reason about, especially by exploiting powerful yet sound 
automated tools like Isabelle’s rewriting engine. 

Compared to the seL4 project [26], the proofs generated 
by the COGENT compiler roughly correspond to the sec- 
ond refinement step from the intermediate executable speci- 
fication to C code (except that the COGENT code is higher- 
level). Klein et al. [26] report that one third of the total seL4
verification effort went into that step (not counting re-usable 
libraries and frameworks), so we can confidently predict that 
our co-generation automates at least a third of the overall ef- 
fort of producing a completely verified file system. The sav- 
ings are likely to be higher, as COGENT already proves a 
number of properties like correct pointer typing that in seL4 
needed complex invariants established in the first refinement 
step, from abstract to executable specification [25].

The verification found six defects in the already tested 
and benchmarked BilbyFs implementation. Three of these 
occurred in serialisation functions, and three in the sync() 
implementation itself. Serialisation and de-serialisation are 
mechanical and tedious to write, which makes them prime 
candidates for further language and proof generation sup- 
port. 

The effort for verifying the complete file system component
chain for the functions sync() and iget() in BilbyFs
was roughly 9.25 person months, and produced roughly
13,000 lines of proof for the 1,350 lines of COGENT code.
Of these, 4.5 person months and ≈4,000 lines of proof were
spent on serialisation/de-serialisation functions (≈850 lines
of COGENT code), which, as mentioned, could be further
automated. An additional ≈1,500 of the ≈13,000 lines of proof
are libraries. The sync()-specific proof size is just about
5,700 lines and took 3.75 person months for ≈300 lines of
COGENT code. The iget() proofs took 1 person month for
≈1,800 lines of proof and ≈200 lines of COGENT code.

This compares favourably with traditional C-level ver- 
ification as for instance in seL4, which spent 12 person 
years with 200k lines of proof for 8,700 source lines of C 
code. Roughly 1.65 person months per 100 C source lines in
seL4 are reduced to ≈0.69 person months per 100 source
lines in COGENT.
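
These normalised figures, and the consistency of the per-part
breakdown given above, can be checked with a few lines of Python. The
function name is chosen for this illustration; all input numbers are
the ones quoted in the text.

```python
def person_months_per_100_lines(person_months: float,
                                source_lines: int) -> float:
    """Verification effort normalised to 100 source lines."""
    return person_months / source_lines * 100


# seL4: 12 person years for 8,700 source lines of C.
sel4 = person_months_per_100_lines(12 * 12, 8700)

# BilbyFs sync() and iget(): 9.25 person months for 1,350 lines of COGENT.
bilbyfs = person_months_per_100_lines(9.25, 1350)

# The per-part figures add up to the reported totals.
parts_months = 4.5 + 3.75 + 1.0                # serialisation, sync(), iget()
parts_cogent_lines = 850 + 300 + 200
parts_proof_lines = 4000 + 1500 + 5700 + 1800  # including libraries

print(f"seL4: {sel4:.2f}, BilbyFs: {bilbyfs:.2f}")
```

The calculation reproduces the two quoted rates to within rounding, and
the parts sum to 9.25 person months, 1,350 lines of COGENT and 13,000
lines of proof.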

5.2 Cogent performance 

The evaluation platform for the ext2 file system is a four-core
i7-6700 running at 3.1 GHz, with a Samsung HD501JL
7200RPM 500G SATA disk, running Linux kernel 4.3.0-1
from the Debian 8.0 distribution. To reduce run-to-run
variability, we disable all but one core. For the BilbyFs
evaluations we use a Mirabox with 1 GiB of NAND flash, a
Marvell Armada 370 single-core 1.2 GHz ARMv7 processor
and 1 GiB of DDR3 memory, running Linux kernel 3.17
from the Debian 6a distribution.

Figure 6. IOZone throughput for random 4 KiB writes.

Figure 7. IOZone throughput for sequential 4 KiB writes.

5.2.1 I/O microbenchmarks 

We use the IOZone file system microbenchmarks [20] to 
evaluate basic performance. We use the default settings for 
an automated run, but include the cost of ‘flush’ at the end 
of each write for ext2. 

Figure 6 shows performance on random writes. For Bil- 
byFs, we do not include the cost of ‘flush’ at the end of each 
write since it completely hides the overhead of the COGENT 
implementation. BilbyFs shows a 5% throughput degrada- 
tion in the worst case (64KiB files) for the COGENT version. 
The CPU load is around 20% compared to 15% on the C 
version. This is mostly the effect of redundant memory copy 
operations in the generated C code when passing structs on 
the stack. 

On the other hand COGENT ext2 shows a modest im- 
provement in throughput. CPU usage for COGENT and na- 
tive Linux are the same at around 10%. We used blktrace 
to investigate; it appears that the COGENT implementation is 
slightly slower, which means that disk I/O operations hit the 
disk more often, instead of being merged in the I/O queue. 
We speculate that the on-disk firmware does a better job of 
scheduling disk writes than the Linux CFQ I/O scheduler. 

Figure 7 shows sequential write performance. Again,
COGENT outperforms Linux native for ext2, with very similar
CPU load (around 10%). Indirect blocks have to be allocated
at 512 KiB and a double-indirect block at 1024 KiB, causing
the dips at these points. Again, tracing the I/O patterns shows
more, and more frequent, disk I/Os for the COGENT
implementation. BilbyFs shows a throughput degradation of about
10% with CPU usage of 20% compared to the 15% of the C
version. This degradation is explained by the same reasons
presented for Figure 6.
Figure 8. Random write performance on RAM disk.

In order to identify overheads resulting from the use of 
COGENT, without disk artifacts perturbing the results, we 
re-run the ext2fs benchmarks on a RAM disk, created as 
follows: 

    modprobe rd rd_size=1048576
    dd if=/dev/zero of=/dev/ram0 bs=1M
    mkfs -t ext2 -O none -r 0 -I 128 -b 1024 \
        /dev/ram0

Figure 8 shows that without physical I/O, COGENT is 
slightly slower than native Linux, as expected. This confirms 
that the performance differences observed in the IO bench- 
marks are indeed the result of disk artifacts. The (±8%) error 
bars (showing standard deviation on ten runs) are because of 
CPU contention with other system activity; they are larger 
for Cogent because its slightly longer running time gives 
more opportunity for such contention. 

5.2.2 File-system macro benchmarks 

For a macro benchmark that exposes present limitations of 
COGENT, we run postmark [22], a benchmark that emulates 
a busy mail server by creating and deleting many small files. 
For ext2 tests, we configure the benchmark to start with 
50,000 files of size 10,000 bytes each, and run on a RAM 
disk, so that I/O latency will not mask COGENT overheads. 
BilbyFs results are on a RAM disk that emulates the MTD
interface, this time on the same i7 machine that we use for
the ext2 tests. BilbyFs’ file creation times are much faster
than ext2’s, so we increased the initial number of files to
200,000.

System            Total time    Creation       Read rate
                  (sec)         (files/sec)    (kB/sec)
C ext2            10            5025           248
COGENT ext2       21            2393           118
C BilbyFs         6             33375          431
COGENT BilbyFs    10            20025          259

Table 2. Postmark run summary. CPU usage is 100% in all
cases.

Table 2 shows the results; each of the values is the mode
of ten runs. Typically five or six of the ten runs have
essentially the same performance; the rest are worse, as
benchmarking overlaps with unavoidable system activity. We can see
significant degradation of the COGENT implementations, a
factor around two for ext2 on the RAM disk, and 1.5 times
for BilbyFs.
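
The degradation factors can be recomputed from the values in Table 2.
The Python sketch below does so for total time, creation rate and read
rate; the data layout and the function name are choices made for this
illustration.

```python
# Table 2 values: (total time in s, creation in files/s, read rate in kB/s).
POSTMARK = {
    "ext2": {"C": (10, 5025, 248), "COGENT": (21, 2393, 118)},
    "BilbyFs": {"C": (6, 33375, 431), "COGENT": (10, 20025, 259)},
}


def slowdown(c, cogent):
    """Slowdown of COGENT relative to C for time, creation and read rate."""
    (c_time, c_create, c_read), (g_time, g_create, g_read) = c, cogent
    # Time grows with slowdown; rates shrink, so those ratios are inverted.
    return g_time / c_time, c_create / g_create, c_read / g_read


for system, runs in POSTMARK.items():
    factors = slowdown(runs["C"], runs["COGENT"])
    print(system, ", ".join(f"{f:.2f}" for f in factors))
```

Within each file system the three ratios agree closely: about 2.1 for
ext2 and about 1.7 for BilbyFs when computed from the tabulated values.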

Profiling COGENT ext2 performance shows that most of
the time is spent in converting from in-buffer directory
entries to COGENT’s internal data type. A better
implementation of the directory handling code could reduce this
problem.

BilbyFs’ bottleneck is in a function that summarises
information about newly created files for the log. The same
function shows as a bottleneck in both C and COGENT
versions, but in the COGENT version it takes about three times
as long.

The underlying reason for these results is that the
COGENT compiler is at present overly reliant on the C
compiler’s optimiser. In particular, it passes many structs on the
stack, which results in much extra copying that the C
compiler fails to optimise away. Further work is needed to
generate code that is more in line with the C compiler’s
optimisation capabilities.

6. Related Work 

Functional verification of file systems belongs to systems
verification in general. Klein [24] gives an overview of the
early work in this area. The more recent achievements are
the comprehensive verification of the seL4 microkernel [25,
26], the verification stack of the Verisoft project [1, 2], the
increase of verification productivity in CertiKOS [12, 13],
and the full end-to-end application verification in Ironclad
[15], which builds on a modified verified Verve kernel [51].

We share the vision for end-to-end systems verification
with Ironclad, with verified file systems as a major
component, but we are not ready to sacrifice performance as they
do. Although our case studies interface with Linux for
meaningful evaluation, COGENT is designed to integrate with the
verification stack of the seL4 kernel, which will allow a
verified COGENT file system to run in protected isolation from
legacy applications for a truly minimal TCB.

Verified File Systems. Early Z specifications of file systems
are those by Morgan and Sufrin [33] for UNIX, and Bevier et
al. [7] for a custom file system. Arkoudas et al. [5] verify two
key operations on the block-level of a file system, but the
result remains partial and the authors even argue that
system components such as file systems will probably always
remain beyond the reach of full correctness proofs.

Joshi and Holzmann [21] proposed a file system
verification challenge, which our work fits into. The aim of this
paper was not to provide a full file system verification,
but rather to show that high-level file system properties are
possible to prove down to the low-level implementation at
the same proof productivity as reasoning about higher-level
specifications. There is plenty of previous work that shows
proofs about such high-level specifications. Our work closes
the gap to code. For instance, Hesselink and Lali [18]
manage to prove a file refinement stack that goes down to a
formal model, but assumes an infinite storage device and other
simplifications, and does not end in code. The Event B
refinement proof by Damchoom et al. [9] similarly does not end
in code. In theory, Event B can generate code from low-level
models, but neither of these verifications is close enough to
achieve usable file system implementations, let alone high
performance.

The most realistic high-level Flash file system verification
work to date is conducted using the KIV tool [39] and goes
from the Flash device layer up to a Linux VFS
implementation [11, 44, 45]. The verification work is still in progress
and the current code generation from low-level models
targets Scala running on a Java Virtual Machine, which implies
run-time overheads and dependency on a large language
run-time. It may be fruitful to investigate a COGENT backend for
this work.

Marić and Sprenger [31] investigate the issue of crash
tolerance in file systems, and previously Andronick [4]
formally analysed similar issues for tearing in smart cards with
persistent storage. COGENT does not provide any special
handling for crash tolerance, but the generated specifications
are detailed enough to facilitate reasoning about it. Chen et al.
[8] take crash tolerance to the level of a complete file system
in a proof that includes functional correctness of an
implementation in Coq. This Coq implementation is roughly on
the same level as the Isabelle/HOL specifications generated
out of COGENT. Their work is a strong argument that
COGENT is hitting the right level of abstraction.

Another stream of work in the literature focuses on more 
automatic techniques such as model checking and static 
analysis [14, 41, 52]. While in theory these techniques could
be used to provide similar guarantees as COGENT, this has 
not yet been achieved in practice. Instead of providing guar- 
antees, such analyses are more useful as tools for efficiently 


finding defects in existing implementations. They also do not 
provide a path to further higher-level reasoning. 

7. Conclusions 

We have presented an approach to prove the functional cor- 
rectness of low-level file system code. We build on COGENT, 
a certifying compiler that produces C code, Isabelle/HOL 
specifications, and translation correctness proofs, and we 
prove properties about the generated specifications in Is- 
abelle/HOL. Composed with the compiler correctness proof, 
we obtain proofs about the C code. 

We find that COGENT not only allows non-experts in 
formal verification to write provably-safe, fully realistic file 
system code, it is also the key step to lowering the effort 
and complexity for the full mechanical verification of file 
systems against high-level formal specifications. 

Our evaluation shows that on a real disk our present file 
system implementations in COGENT have almost identical 
throughput with their C counterparts, with slightly more 
CPU usage. Furthermore, we find that this degradation can, 
in many cases, be avoided by improving the COGENT com- 
piler output, such that it is less dependent on the optimiser 
of the underlying C compiler. 

Cogent is a sequential language. Although it allows 
asynchronous disk access, obvious future work is full con- 
currency. While not a trivial transition, we believe the lin- 
ear type system of COGENT will help: it lets the compiler 
keep track of memory locations and side effects. Moreover, 
the high-level functional semantics can already be executed 
fully concurrently — sequentialisation happens through the 
explicit states that the programmer can see and pass through 
the program in COGENT, making interaction points ex- 
plicit. Given additional suitable synchronisation primitives, 
if those states become more fine-grained, more concurrency 
becomes available. 